Robust Emotion Recognition using Pitch Synchronous and Sub-syllabic Spectral Features
Abstract
This chapter discusses the use of vocal tract information for recognizing emotions. Linear prediction cepstral coefficients (LPCC) and mel-frequency cepstral coefficients (MFCC) are used as correlates of vocal tract information. In addition to LPCCs and MFCCs, formant-related features are also explored for recognizing emotions from speech. Extraction of these spectral features is discussed in brief, including extraction from sub-syllabic regions such as consonants, vowels, and consonant-vowel transition regions, and extraction by pitch-synchronous analysis. The basic philosophy and use of Gaussian mixture models for classifying emotions is also discussed. The emotion recognition performance obtained with the different vocal tract features is compared, and the proposed spectral features are evaluated on Indian and Berlin emotion databases. The performance of Gaussian mixture models in classifying emotional utterances using vocal tract features is compared with that of neural network models.
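As background for the LPCC features named above, the linear prediction coefficients they are derived from can be obtained by the autocorrelation method with the Levinson-Durbin recursion. A minimal numpy sketch follows; the prediction order and test signal are illustrative assumptions, not values from the chapter:

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the LP normal equations given autocorrelation values r[0..order].

    Returns the LP coefficients a (with a[0] = 1) and the final prediction
    error. Cepstral coefficients (LPCC) are conventionally obtained from
    these a's by a separate recursion.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient from the current prediction residual
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err

def lpc(frame, order):
    """LP coefficients of one (windowed) speech frame via its autocorrelation."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
    return levinson_durbin(r, order)
```

For an AR(1)-shaped autocorrelation r[k] = 0.9**k, the recursion recovers a single predictor coefficient of -0.9 and a vanishing second coefficient, which is a quick sanity check of the implementation.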
Similar Resources
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interaction. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech. In this study, we propose spectral pattern features (SPs) and harmonic energy features (HEs) for emotion recognition. These features, extracted from the spectrogram ...
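The spectrogram such features are read from is a magnitude short-time Fourier transform; a minimal numpy sketch, where the frame length and hop size are illustrative assumptions:

```python
import numpy as np

def magnitude_spectrogram(x, n_fft=256, hop=128):
    """Magnitude spectrogram: Hann-windowed frames -> |rFFT|.

    Returns an array of shape (n_fft // 2 + 1, n_frames), i.e. frequency
    bins along rows and time frames along columns.
    """
    win = np.hanning(n_fft)
    starts = range(0, len(x) - n_fft + 1, hop)
    frames = np.stack([x[s:s + n_fft] * win for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T
```

A sinusoid whose frequency falls exactly on an FFT bin should produce a spectral peak at that bin in every frame, which makes the sketch easy to verify.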
Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language
The setup of an emotion recognition or emotional speech recognition system depends directly on how emotion changes the speech features. In this research, the influence of the anger and happiness emotions on speech was evaluated and the results were compared with neutral speech, using the pitch frequency and the first three formant frequencies. The experimental results showed that there are lo...
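Pitch statistics of the kind analyzed above start from a per-frame F0 estimate. A minimal autocorrelation-based sketch; the sampling rate and 60-400 Hz search band are illustrative assumptions:

```python
import numpy as np

def pitch_autocorr(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate F0 of a voiced frame from the autocorrelation peak.

    The peak lag is searched only inside the plausible pitch-period
    range [fs/fmax, fs/fmin] samples.
    """
    frame = frame - frame.mean()
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(r[lo:hi]))
    return fs / lag
```

On a clean 200 Hz sinusoid sampled at 8 kHz the period is exactly 40 samples, so the estimator should return 200 Hz; real speech frames would additionally need a voicing decision before trusting the estimate.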
Improving automatic emotion recognition from speech signals
We present a speech signal driven emotion recognition system. Our system is trained and tested with the INTERSPEECH 2009 Emotion Challenge corpus, which includes spontaneous and emotionally rich recordings. The challenge includes classifier and feature sub-challenges with five-class and two-class classification problems. We investigate prosody related, spectral and HMM-based features for the ev...
Recognition and Classification of Human Emotion from Audio
In this paper, an audio emotion recognition system is proposed that uses a mixture of rule-based and machine learning techniques to improve recognition efficacy in the audio path. The audio path is designed using a combination of input prosodic features (pitch, log-energy, zero-crossing rate, and the Teager energy operator) and spectral features (mel-frequency cepstral coefficients). Me...
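Of the prosodic features listed, the Teager energy operator has a particularly compact discrete form, psi[x(n)] = x(n)^2 - x(n-1)*x(n+1); a minimal numpy sketch:

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy operator.

    For a pure tone x(n) = A*cos(w*n + p) the operator evaluates exactly
    to the constant A**2 * sin(w)**2, reflecting both amplitude and
    frequency of the signal.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]
```

The output is two samples shorter than the input because the operator needs one neighbor on each side.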
Emotion Recognition Using Neural Network: A Comparative Study
Emotion recognition is an important research field with many applications nowadays. This work emphasizes recognizing different emotions from the speech signal. The extracted features are related to statistics of pitch, formant, and energy contours, as well as spectral, perceptual, and temporal features, jitter, and shimmer. Artificial Neural Networks (ANN) were chosen as the classifi...